CSCI E-25¶

Filtering Images¶

Steve Elston¶

Introduction¶

Preprocessing of images is generally an essential step in most computer vision applications. Preprocessing typically involves filtering of one type or another, which is our focus here. We will concentrate on two types of filters.

  1. Filters for noise elimination and edge detection.
  2. Morphology filters.

To get started execute the code in the cell below to import the packages you will need.

In [12]:
import skimage 
from skimage import data
from skimage.filters.rank import equalize
import skimage.filters as skfilters
import skimage.morphology as morphology
import skimage.transform as transform
from skimage.color import rgb2gray
from skimage import exposure
from skimage.transform import pyramid_gaussian
import numpy as np
from scipy import fft
import math
from matplotlib.colors import LogNorm
import matplotlib.pyplot as plt
%matplotlib inline
In [13]:
skimage.__version__
Out[13]:
'0.20.0'

Load and Prepare Image¶

For the exercises you will work with a complex image of a striped cat. Execute the code in the cell below to load and display the image.

In [14]:
cat_image = data.cat()
print('Image size = ' + str(cat_image.shape))
_=plt.imshow(cat_image)
Image size = (300, 451, 3)

This image is complex. The image shows typical facial features of a cat: eyes, nose, whiskers, etc. The cat's face also has a complex pattern of stripes and patches of different colors. Additionally, there are complex features in the background, including the cat's chest patch, part of an ear and blurred patches that appear to include the cat's flank. Preparing such a complex image for further processing can be quite challenging.

Exercise 2-1: You will now prepare the cat image for further processing by the following steps:

  1. Convert the 3-channel color image to gray-scale.
  2. Display the gray-scale image along with the distribution of pixel values using the functions provided.
  3. Apply adaptive histogram equalization to the gray-scale image. Name the resulting image cat_grayscale_equalized.
  4. Display the equalized gray-scale image along with the distribution of pixel values using the functions provided.
In [52]:
def plot_grayscale(img, h=8):
    plt.figure(figsize=(h, h))
    _=plt.imshow(img, cmap=plt.get_cmap('gray'))
    plt.show()
    
def plot_gray_scale_distribution(img):
    '''Function plots histograms a gray scale image along 
    with the cumulative distribution'''
    fig, ax = plt.subplots(1,2, figsize=(12, 5))
    ax[0].hist(img.flatten(), bins=50, density=True, alpha=0.3)
    ax[0].set_title('Histogram of image')
    ax[0].set_xlabel('Pixel value')
    ax[0].set_ylabel('Density')
    ax[1].hist(img.flatten(), bins=50, density=True, cumulative=True, histtype='step')
    ax[1].set_title('Cumulative distribution of image')  
    ax[1].set_xlabel('Pixel value')
    ax[1].set_ylabel('Cumulative density')  
    plt.show()

# Convert the 3-channel color image to gray-scale
cat_grayscale = rgb2gray(cat_image)

plot_grayscale(cat_grayscale)
plot_gray_scale_distribution(cat_grayscale)

# Apply adaptive histogram equalization to the gray-scale image
cat_grayscale_equalized = exposure.equalize_adapthist(cat_grayscale)

plot_grayscale(cat_grayscale_equalized)
plot_gray_scale_distribution(cat_grayscale_equalized)

Answer the following questions:

  1. How has the adaptive histogram equalization changed the image and is this change an improvement from a computer vision point of view?
  2. How has the adaptive histogram equalization changed the distribution of pixel values? End of exercise.

Answers:

  1. We observe that the adaptive histogram equalization enhances the local contrast of the gray-scale image, especially in regions of low contrast, which makes details of the image more pronounced. The brightness of the image is also enhanced.
  2. Applying the adaptive histogram equalization spreads the pixel values more evenly across the histogram. This means that regions with low contrast see an increase in intensity values, while intensity values might be compressed in regions with high contrast. The cumulative distribution curve is also closer to a straight line than that of the unequalized gray-scale cat image.

Important note!: For subsequent exercises, unless otherwise stated, your starting point should be the cat_grayscale_equalized image just created. This is in keeping with good practice of starting with a properly adjusted image before applying any CV algorithms.

Power Spectrum of Images¶

A useful way to understand the effect of a filter on an image is to examine the power spectrum of the image. This is done by decomposing the image into its sinusoidal components of different frequencies. This decomposition is known as a Fourier transform. We will not pursue the details in this discussion, going only far enough to gain a basic ability to interpret the results.

To compute a Fourier transform of discretely sampled data, such as digital image data, we use the discrete Fourier transform (DFT). For a one-dimensional sequential sample, $\mathbf{x} = [x_1, x_2, x_3, \ldots, x_n]$, we can decompose this sequence into a sequence of complex frequency components, $\mathbf{\Omega} = [\omega_0, \omega_1, \omega_2, \ldots, \omega_{N-1}]$. The basic DFT for one frequency component, $\Omega_k$, is formulated as:

$$\Omega_k = \sum_{n=0}^{N-1} x_n \cdot e^{-\frac{i2\pi}{N}kn}$$

How can we understand this difficult relationship? First, recall that the symbol $i$ indicates the imaginary unit of a complex number. Next, it is useful to note that we can expand a complex exponential into sine and cosine functions using Euler's formula:

$$ e^{-i\,q} = \cos(q) - i\,\sin(q) $$

Applying Euler's formula to the DFT, we get the following relation:

$$ \Omega_k = \sum_{n=0}^{N-1} x_n \cos \Bigg(\frac{2 \pi}{N}kn \Bigg) - i\ x_n\ \sin \Bigg( \frac{2 \pi}{N}kn \Bigg) $$

From the above relationships one can see that the DFT is a linear superposition of complex sinusoids weighted by the sequence $\mathbf{x}$. Each component, $\Omega_k$, represents the coefficient for a specific frequency and phase. Another way to understand the DFT is as a convolution between sinusoids of different frequencies and phases and the sequence, $\mathbf{x}$. Phase is represented by the complex value.

For our purposes here, we are not interested in the complex representation. Rather, we are interested in the power spectrum. The power spectrum is the square of the magnitude of the spectral components. The power spectrum is symmetric about zero. Each frequency component is represented on both the positive side and the negative side.

So far, we have focused on one-dimensional sequences, $\mathbf{x}$. Of course, images are multi-dimensional. While we will not go into details, one can easily extend the DFT to multiple dimensions.
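Before moving on, the DFT formula above can be checked directly against a library FFT. A minimal sketch (the short test signal here is arbitrary):

```python
import numpy as np
from scipy import fft

# An arbitrary short test signal
x = np.array([1.0, 2.0, 0.0, -1.0])
N = len(x)

# Direct evaluation of Omega_k = sum_n x_n * exp(-i*2*pi*k*n / N)
n = np.arange(N)
naive_dft = np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N))
                      for k in range(N)])

# scipy's FFT computes the same quantity, just much faster
assert np.allclose(naive_dft, fft.fft(x))

# Power spectrum: squared magnitude of the complex components
power = np.abs(naive_dft) ** 2
```

Note that the power spectrum discards the phase information carried by the complex values, which is why it is real-valued and symmetric.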

Power spectrum example¶

To get a feel for how Fourier transforms work let's examine a simple one-dimensional example. The code in the cell below does the following:

  1. Computes a series of values with an impulse, or a closely spaced set of edges, in the middle.
  2. The power spectrum is computed. The FFT is computed using scipy.fft.fft. To compute the real-valued power spectrum, the square of the absolute value of the FFT is taken.
  3. The original series and the power spectrum are displayed.

Execute this code and examine the results.

In [21]:
edge_series = np.array([0.0]*256)
edge_series[126:130] = 1.0   
x = list(range(256))

edge_fft = fft.fft(edge_series)
edge_power = np.square(np.abs(fft.fftshift(edge_fft)))

_,ax = plt.subplots(2,1, figsize=(8,6))
ax[0].plot(x, edge_series);
ax[1].plot(x, edge_power);

Examine the results and notice the following:

  1. As expected the impulse is quite localized.
  2. The power spectrum has the following properties:
    a. The center of the power spectrum at 128 represents zero frequency.
    b. The real-valued power spectrum is symmetric about 0.
    c. The power of each frequency is represented twice.
    d. Frequencies above 0 (the center) increase from the center toward the right.
    e. At the same time, frequencies below 0 (the center) increase from the center toward the left.
  3. The power spectrum of the impulse is quite wide.
    a. The maximum value is at 0 frequency, at the center.
    b. The power spectrum does not completely die to 0 until the Nyquist frequency is reached.
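These properties can be verified numerically. A small sketch, rebuilding the impulse series from the example above:

```python
import numpy as np
from scipy import fft

# Rebuild the impulse series used in the example above
edge_series = np.zeros(256)
edge_series[126:130] = 1.0

power = np.square(np.abs(fft.fftshift(fft.fft(edge_series))))

# After fftshift, zero frequency sits at index 128 and holds the maximum power
assert np.argmax(power) == 128

# For a real-valued input the power spectrum is symmetric about the center
assert np.allclose(power[129:], power[127:0:-1])
```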

We have just examined the properties of the Fourier transform of an impulse function. But what happens when we take the Fourier transform of the smooth, localized function generated by the power spectrum of the impulse? To find out, execute the code in the cell below.

In [22]:
sigmoid_fft = fft.fft(edge_power)
sigmoid_power = np.square(np.abs(fft.fftshift(sigmoid_fft)))

_,ax = plt.subplots(2,1, figsize=(8,6))
ax[0].plot(x, edge_power);
ax[1].plot(x, sigmoid_power);

Examine these results. The Fourier transform of this smooth function is a narrow spike, indicating the frequency spectrum has only a few components.

The important points of these examples are:
a. Series with rapid changes (edges) have a broad frequency spectrum.
b. Smooth functions have a narrow frequency spectrum.

These points are most helpful in analyzing the effect of filters that limit the power spectrum.

2D DFT Example¶

Images are inherently multidimensional. Therefore, we will now investigate some key properties of 2-d Fourier transforms. We will start with a synthetic example.

The code in the cell below does the following:

  1. Generate a 2-d pattern with a mixture of two spatial frequencies.
  2. Display the synthetic image.
  3. Compute the 2-d power spectrum of the image and display the result. The FFT is shifted to the center of the array using the scipy.fft.fftshift function. The magnitude of the complex number is computed using the numpy.absolute function and then squared to units of power using the numpy.square function.

Execute the code in the cell below.

In [23]:
image_power = lambda x: np.square(np.abs(fft.fftshift(fft.fft2(x))))
#image_power = lambda x: np.square(np.absolute(fft.fft2(x)))

dim = 256
x = np.arange(0, dim*dim, 1)
sin_image = 1.0 - 0.25 * np.sin(np.divide(x, math.pi)).reshape((dim,dim)) \
            - 0.25 * np.sin(np.divide(x, 0.25* math.pi)).reshape((dim,dim))

_, ax = plt.subplots(1, 2, figsize=(8, 16))
ax[0].imshow(sin_image, cmap=plt.get_cmap('gray'));
ax[0].set_title('Original image');
ax[1].imshow(image_power(sin_image), norm=LogNorm(vmin=5));   
ax[1].set_title('Fourier transform of image')
plt.show()

Examine these results noticing the following:

  1. The original synthetic image shows the effect of the multiple frequencies creating a diagonal pattern.
  2. The DFT of the image is shown on the right. The interpretation of the 2-dimensional power spectrum plot requires some explanation. The plot shows both vertical and horizontal frequency components. The Fourier transform values can be positive and negative, and are symmetric for a real-valued image. Notice these additional properties of the DFT:
    a. There is a small bright dot at the center representing the zero frequency component, or average of the image pixel values.
    b. The display of the DFT shows 4 bright spots which represent the 2-d frequency components of the image. Notice that there is both a vertical and a horizontal component, represented by the horizontal and vertical lines respectively. The further from the center a component is, the higher the frequency. The maximum (at the Nyquist rate) vertical and horizontal frequency components are in the corners of the plot. Intermediate frequencies are between the center and the corners.
    c. As with the 1-dimensional DFT, each frequency component is displayed twice. In the case of the 2-d DFT the upper-right quadrant is a mirror image of the lower-left quadrant and the lower-right quadrant is a mirror image of the upper-left quadrant.

Real image power spectrum example¶

We will now apply the DFT to compare the power spectrum of the unequalized and equalized gray scale cat images. To see the difference in the frequency components of the original image and the equalized image, execute the code in the cell below.

In [24]:
def plot_2d_fft(img1, img2):
    fig, ax = plt.subplots(1, 2, figsize=(15, 15))
    im = ax[0].imshow(image_power(img1), norm=LogNorm(vmin=5))    
    im = ax[1].imshow(image_power(img2), norm=LogNorm(vmin=5))        
    plt.show()

plot_2d_fft(cat_grayscale, cat_grayscale_equalized)     

Compare these displays. First, notice that much of the power (energy) in the unequalized (left) image is at or near 0 frequency, indicating that the values across the image are mostly constant. Second, you can see that equalization has amplified the higher frequency components for both dimensions of the image. These higher frequency components are closer to the corners of the plot.

Convolution in 1 dimension¶

To get started, let's work with the simplest case of one-dimensional convolution. Convolution is the process of applying a filter of a certain span over one or more dimensions of the data.

To understand 1-d convolution conceptually, examine Figure 1.1 below. At each time step a value of the output series is computed by applying the convolution operator to the input series. The convolution operator is a set of weights $\{ W_1, W_2, W_3 \}$ which are multiplied by the input values. The weighted inputs are then summed to compute the value of the output series. The operator is then moved along the input series (by a stride of 1 in this case). The process proceeds until the end of the series is reached.

Notice that the output series is shorter than the input series by $span - 1$. In the illustrated example the kernel has a span of 3, making the output series shorter by 2 points. This reduction in the length of the output is inherent in all convolution operations.

1-d convolution with operator span of 3

Mathematically, we write the convolution operation:

$$s(t) = (x * k)(t) = \sum_{\tau = -\infty}^{\infty} x(\tau) k(t - \tau)$$

where,
$s(t)$ is the output of the convolution operation at time $t$,
$x$ is the series of values,
$k$ is the convolution kernel,
$*$ is the convolution operator,
$\tau$ is the time difference of the convolution operator.

Summing over the range $(-\infty, \infty)$ seems to be a problem. However, in practice the convolutional kernel has a finite span, $s$. For an operator with an odd-numbered span, we can then rewrite the above equation as:

$$s(t) = (x * k)(t) = \sum_{\tau = {t - a}}^{t + a} x(\tau) k(t - \tau)$$

Where, $a = \frac{1}{2}(kernel\ span - 1)$, for an odd length kernel.

In other words, we only need to take the sum over the span of the convolution operator. Notice that this operator is non-causal, since values ahead of the time step being operated on are used to compute the output.
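To make the finite-span formula concrete, here is a small worked example using a 3-point moving-average kernel. Since this kernel is symmetric, flipping it makes no difference, so the sliding dot product matches the convolution formula exactly:

```python
import numpy as np

# A short series and a 3-point moving-average kernel (span 3, a = 1)
x = np.array([1.0, 4.0, 7.0, 10.0, 13.0])
k = np.array([1/3, 1/3, 1/3])

# Manual valid convolution: dot the kernel with each 3-point window
manual = np.array([np.dot(x[i:i+3], k) for i in range(len(x) - 2)])

# numpy's convolve in 'valid' mode gives the same result
assert np.allclose(manual, np.convolve(x, k, mode='valid'))

print(manual)  # [4. 7. 10.] -- span - 1 = 2 points shorter than x
```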

Exercise 2-2: To make this all a bit more concrete, you will try a code example. You will construct a short filter and apply it over a 1-d series. In the cell below you will implement a simple 1-d convolution operation given a set of kernel weights. The core of this operation is a dot product between the subset of values of the series and the kernel weights. You will first use a 3-point moving average kernel, $[0.333, 0.333, 0.333]$. Do the following:

  1. Create a function that performs convolution of the moving average kernel with the random series. You can perform this operation by iteratively calling numpy.dot as you move the operator along overlapping 3-point segments of the series. Your function should have two arguments, the series and the kernel. Make sure you account for the fact that for this operator with a span of 3 (an odd number) the length of the result will be 2 points shorter than the original series, one point shorter on each end.
  2. Use the code provided to create a step-function series with random noise added.
  3. Execute your function and plot the result using the function provided.
  4. Finally, execute your function and plot the result using a 5-point Gaussian kernel $[0.05, 0.242, 0.399, 0.242, 0.05]$.
In [53]:
def plot_conv(series, conv, span):
    x = list(range(series.shape[0]))
    offset = int((span-1)/2)
    end = series.shape[0] - offset
    plt.plot(x, series, label = 'Original series')
    plt.plot(x[offset:end], conv, color = 'red', label = 'Convolution result')
    plt.legend()
    plt.xlabel('Index')
    plt.ylabel('Value')
    plt.title('Series of raw values and convolution')
    plt.show()

np.random.seed(12233)
series = np.concatenate((np.random.normal(size = 40), np.random.normal(size = 40) + 2.0))  

def conv_1d(series, kernel):
    conv_result = []
    # Calculate the offset based on the kernel size
    offset = len(kernel) // 2
    # Iterate over the series
    for i in range(offset, len(series) - offset):
        window = series[i - offset:i + offset + 1]
        # Compute the dot product between the window and the kernel
        result = np.dot(window, kernel)
        conv_result.append(result)
    return np.array(conv_result)

# 3-point moving average kernel
moving_average_kernel = np.array([0.333, 0.333, 0.333])

# Perform convolution with the 3-point moving average kernel
moving_average_conv = conv_1d(series, moving_average_kernel)
plot_conv(series, moving_average_conv, span=len(moving_average_kernel))

# 5-point Gaussian kernel
gaussian_kernel = np.array([0.05, 0.242, 0.399, 0.242, 0.05])

# Perform convolution with the 5-point Gaussian kernel
gaussian_conv = conv_1d(series, gaussian_kernel)
plot_conv(series, gaussian_conv, span=len(gaussian_kernel))

Answer the following questions:

  1. What effect can you observe when comparing the original and convolved series with respect to both overall mean and noise spikes?
  2. What reason(s) can you think of for the differences in the convolved series?
  3. Are both of these kernels properly normalized and why? End of exercise.

Answers:

  1. For the 3-point moving average kernel, the convolution operation smooths out the original series by averaging values, effectively reducing the influence of outliers and noise on the overall mean. For the 5-point Gaussian kernel, the convolution operation similarly smooths the original series, but with a more pronounced effect due to the wider span of the Gaussian kernel, leading to a smoother curve with less variation. For both kernels, noise spikes and high-frequency fluctuations in the original series are dampened in the convolved series. The 5-point Gaussian kernel appears to provide more effective noise reduction than the 3-point moving average kernel in the charts above.

  2. The differences observed above in both convolved series can be attributed to variations in kernel characteristics, including size, shape, weight distribution, and span of influence.

  3. We have defined the 3-point moving average kernel as [0.333, 0.333, 0.333]. The sum of the elements in the kernel is 0.999, very close to 1, which indicates proper normalization. The 5-point Gaussian kernel is defined as [0.05, 0.242, 0.399, 0.242, 0.05]. The sum of its elements is 0.983, slightly less than 1 because the Gaussian is truncated to a 5-point span. So we conclude that both kernels are properly normalized, at least to a good approximation.
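As a quick numerical check of the normalization discussed in answer 3:

```python
import numpy as np

moving_average_kernel = np.array([0.333, 0.333, 0.333])
gaussian_kernel = np.array([0.05, 0.242, 0.399, 0.242, 0.05])

# Both kernels should sum to (approximately) 1 so they preserve the series mean
assert abs(moving_average_kernel.sum() - 1.0) < 0.01   # sums to 0.999
assert abs(gaussian_kernel.sum() - 1.0) < 0.02         # sums to 0.983
```

A kernel whose weights sum to 1 leaves a constant series unchanged, which is why normalization matters for smoothing filters.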

Applying 2-d Convolution¶

Next, let's look at how we can apply convolution operations in two dimensions. The theory is the same as the 1d case.

Figure 1.2 illustrates a simple example of applying a convolution operator to a 2-d input array. The 2-d input array can be a gray-scale image, for example.

A 3X3 convolutional operator is illustrated in Figure 1.2. This operator is applied to a small image of dimension 4X4 pixels. Within the 4X4 image the operator is only applied four times, where the complete operator lies on the image. Convolution where the operator is completely contained within the input sample is sometimes called valid convolution.

In this case the 4X4 pixel input results in a 2X2 pixel output. You can think of the convolutional operator mapping from the center of the operator input to a pixel in the output. Similar to the 1-d case, for a convolution operator with span $s$ in both dimensions the output array will be reduced in each dimension by $s - 1$.

**Figure 1.2. Simple 2-d convolution operation**

For a convolution with the identical span $s$ in both dimensions we can write the convolution operations as:

$$S(i,j) = (I * K)(i,j) = \sum_{m = {i - a}}^{i + a} \sum_{n = j - a}^{j + a} I(m, n) K(i-m, j-n)$$

Notice that $S$, $I$ and $K$ are all tensors. Given this fact, the above formula is easily extended to higher dimensional problems.

The convolutional kernel $K$ and the image $I$ commute, so there is an alternative representation obtained by applying an operation known as kernel flipping. Flipped-kernel convolution can be represented as follows:

$$S(i,j) = (I * K)(i,j) = \sum_{m = -a}^{a} \sum_{n = -a}^{a} I(i-m, j-n) K(m, n)$$
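The valid 2-d convolution described above can be sketched in a few lines of numpy. For the symmetric averaging kernel used here, kernel flipping makes no difference, so a sliding elementwise-product-and-sum matches the convolution formula:

```python
import numpy as np

def conv_2d_valid(image, kernel):
    """Valid 2-d convolution: the kernel must lie entirely on the image."""
    s = kernel.shape[0]                      # assume a square, odd-span kernel
    out_h = image.shape[0] - s + 1           # each dimension shrinks by s - 1
    out_w = image.shape[1] - s + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # elementwise product of kernel and image patch, then sum
            out[i, j] = np.sum(image[i:i+s, j:j+s] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # a 4x4 image as in Figure 1.2
kernel = np.full((3, 3), 1/9)                      # 3x3 averaging kernel

result = conv_2d_valid(image, kernel)
print(result.shape)  # (2, 2): the 4x4 input yields a 2x2 output
```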

Gaussian Filtering and Noise Reduction¶

A Gaussian filter computes a 2-dimensional weighted moving average as it is passed across the image. The weights are determined by a 2-dimensional truncated Gaussian function. The result is a lower-bandwidth or smoothed version of the image. The span of the Gaussian function is determined by the standard deviation, $\sigma$, of the Gaussian. The wider the span or the lower the bandwidth the more smoothing of the filtered image. The 2-dimensional Gaussian function is expressed:

$$G(x,y) = \frac{1}{2 \pi \sigma^2} e^{-\frac{x^2 + y^2}{2 \sigma^2}}$$

The 2-d Gaussian filter is approximated by a convolutional kernel. As an example, we can formulate the $3 \times 3$ Gaussian kernel with $\sigma = 1$ as:

$$ K(i,j) = \frac{1}{16} \begin{pmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{pmatrix} $$
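One common way to obtain this kernel is as the outer product of the 1-d binomial filter $[1, 2, 1]/4$, a standard discrete approximation to a Gaussian. A quick check:

```python
import numpy as np

# 1-d binomial filter, a common discrete approximation to a Gaussian
k1 = np.array([1.0, 2.0, 1.0]) / 4.0

# The 2-d kernel is the outer product of the 1-d filter with itself
K = np.outer(k1, k1)

# This reproduces the 1/16 * [[1,2,1],[2,4,2],[1,2,1]] kernel above
assert np.allclose(K * 16, [[1, 2, 1], [2, 4, 2], [1, 2, 1]])
assert np.isclose(K.sum(), 1.0)   # properly normalized
```

Because the kernel is an outer product, the 2-d Gaussian filter is separable: it can be applied as two cheaper 1-d passes, one over rows and one over columns.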

Exercise 2-3: The wider the span of the Gaussian filter the narrower the bandwidth, which has the effect of blurring the image. To see this effect with Gaussian filtered images of the gray-scale equalized cat image, plot the filtered images in 4 rows of a plot array using values of $\sigma = [1.0,2.0,3.0,4.0]$. Use the skimage.filters.gaussian function. Make sure the titles of the images indicate the value of sigma. In the second column of each row display the 2-dimensional FFT of the filtered image.

In [54]:
fig, ax = plt.subplots(4,2, figsize=(15, 20))
ax = ax.flatten()

sigmas = [1.0, 2.0, 3.0, 4.0]

# Iterate over sigma values
for i, sigma in enumerate(sigmas):
    # Apply Gaussian filter to the grayscale equalized cat image
    filtered_image = skfilters.gaussian(cat_grayscale_equalized, sigma=sigma)
    ax[i*2].imshow(filtered_image, cmap='gray')
    ax[i*2].set_title(f'Filtered Image (Sigma={sigma})')
    
    # Compute the shifted 2D FFT of the filtered image
    fft_image_shifted = fft.fftshift(fft.fft2(filtered_image))
    
    # Plot the log magnitude of the 2D FFT
    ax[i*2 + 1].imshow(np.log(np.abs(fft_image_shifted)), cmap='gray')
    ax[i*2 + 1].set_title('2D FFT of Filtered Image')
    
plt.tight_layout()
plt.show()
  1. How does the blurring of the image and the bandwidth shown by the FFT change with the value of $\sigma$?
  2. Compare the FFTs of the filtered image to the original equalized image. How has the bandwidth of the image features changed with the value of $\sigma$?
    End of exercise.

Answers:

  1. We observe from the images above that increasing sigma in the Gaussian filter results in more significant blurring of the image and a narrower bandwidth in the FFT domain.
  2. From comparing the FFTs of the filtered image to the original equalized image, we observe that smaller sigma values preserve high-frequency components and maintain sharp image features in the frequency domain, resulting in broader bandwidths of image features. However, larger sigma values reduce high-frequency noise, leading to narrower bandwidths and less defined image features in the frequency domain.

Exercise 2-4: Examining the difference between the original image and filtered image can help you understand the effect of the filter. To create this display, do the following:

  1. Compute the Gaussian filtered image with $\sigma = 1.0$ from the gray-scale equalized cat image.
  2. Compute the difference between the original gray-scale equalized image and the Gaussian filtered image.
  3. Display the image of the difference and the distribution of the difference.
In [55]:
sigma = 1.0

filtered_image = skfilters.gaussian(cat_grayscale_equalized, sigma=sigma)

difference_image = cat_grayscale_equalized - filtered_image

# Display the image of the difference
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(difference_image, cmap='gray')
plt.title('Difference Image')
plt.axis('off')

# Display the distribution of the difference
plt.subplot(1, 2, 2)
plt.hist(difference_image.flatten(), bins=50, color='blue', alpha=0.7)
plt.title('Distribution of Difference')
plt.xlabel('Pixel Value Difference')
plt.ylabel('Frequency')

plt.tight_layout()
plt.show()

Examine the plots and answer the following questions:

  1. What aspects of the cat image are highlighted by the difference between the original and filtered images?
  2. What does the narrow range of pixel values of the difference tell you about the differences between the original and filtered images?
    End of exercise.

Answers:

  1. We observe that the difference between the original and filtered images highlights regions where the Gaussian filter has altered the appearance of the cat image, particularly in terms of edge sharpness, texture, local contrast, and overall smoothness.
  2. The narrow range of pixel values in the difference image indicates subtle and localized differences between the original and filtered images as well as the impact of the Gaussian filter on specific regions and features within the image. The histogram is almost symmetrical with pixel value differences ranging from -0.3 to 0.3.

Gaussian Filtering with Shot Noise¶

One of the primary purposes of applying filtering to images is the suppression of noise. Many computer vision algorithms, particularly machine learning algorithms, have poor noise robustness. We will now investigate the use of filters for noise suppression.

A simple model of image noise is known as shot noise or salt and pepper noise. In this model, the noise comprises random pixel values at the extremes of the range. For example, for unsigned integer values the noise has values of $[0, 255]$, or for floating point values, $[0.0, 1.0]$. To see an example of salt and pepper noise on the cat image execute the code in the cell below.

In [35]:
cat_grayscale_equalized_noise = np.copy(cat_grayscale_equalized).flatten() 
cat_grayscale_equalized_noise[np.random.choice(cat_grayscale_equalized_noise.size, 10000, replace=False)] = 0.0
cat_grayscale_equalized_noise[np.random.choice(cat_grayscale_equalized_noise.size, 10000, replace=False)] = 1.0
cat_grayscale_equalized_noise = cat_grayscale_equalized_noise.reshape(cat_grayscale_equalized.shape)

plot_grayscale(cat_grayscale_equalized_noise)
plot_gray_scale_distribution(cat_grayscale_equalized_noise)  

Examine the image, noticing the light pixels with value 1.0 and the dark pixels with value 0.0 randomly spread throughout the image. Further, the distribution of pixel values shows spikes at 0.0 and 1.0.

Execute the code in the cell below to display the FFTs of the equalized image and the image with the salt and pepper noise.

In [36]:
plot_2d_fft(cat_grayscale_equalized, cat_grayscale_equalized_noise)  

The image with the salt and pepper noise added has a much wider bandwidth. Further, the higher frequency components of the FFT of the salt and pepper noise image appear nearly uniformly random.

We will now explore how filters might be able to improve the quality of this noisy image.

Exercise 2-5: You will now investigate how the salt and pepper noise added to the gray-scale cat image can be suppressed using a Gaussian filter, and how changing the Gaussian filter span affects the image. To see this effect plot the Gaussian filtered image in 4 rows of a plot array using values of $\sigma = [1.0,2.0,3.0,4.0]$. Make sure the titles of the images indicate the value of sigma. In the second column of each row display the 2-dimensional FFT of the filtered image.

In [56]:
sigmas = [1.0, 2.0, 3.0, 4.0]

fig, ax = plt.subplots(4, 2, figsize=(15, 20))
ax = ax.flatten()

# Iterate over sigma values
for i, sigma in enumerate(sigmas):
    # Apply Gaussian filter to the equalized cat image with salt and pepper noise
    filtered_image = skfilters.gaussian(cat_grayscale_equalized_noise, sigma=sigma)
    ax[i*2].imshow(filtered_image, cmap='gray')
    ax[i*2].set_title(f'Filtered Image (Sigma={sigma})')
    
    # Compute the shifted 2D FFT of the filtered image
    fft_image_shifted = fft.fftshift(fft.fft2(filtered_image))
    
    # Plot the log magnitude of the 2D FFT
    ax[i*2 + 1].imshow(np.log(np.abs(fft_image_shifted)), cmap='gray')
    ax[i*2 + 1].set_title('2D FFT of Filtered Image')
    
plt.tight_layout()
plt.show()

Examine the images. How can you describe the trade-off between the span (or bandwidth) of the Gaussian filter, the suppression of the salt and pepper noise, and the resolution of the image?
End of exercise.

Answer: From the above outputs, we observe that there is a trade-off between the span of the Gaussian filter, the suppression of salt and pepper noise, and the resolution of the image. Higher sigma values lead to more effective noise suppression but may result in loss of image resolution and fine details due to increased blurring. Lower sigma values preserve better image resolution but may not suppress noise as effectively.

Exercise 2-6: Somewhat subjectively, the image with $\sigma=3$ is a reasonable trade off between blurring and noise suppression. The next question to address is what the filtering has done to the distribution of pixel values and the bandwidth of the image. To find out do the following:

  1. Compute a filtered version of the noisy gray-scale cat image with $\sigma=3$
  2. Plot the distribution of the pixel values of the filtered image.
  3. Display the FFT of the noisy image next to the filtered image.
In [57]:
sigma = 3.0

filtered_image = skfilters.gaussian(cat_grayscale_equalized_noise, sigma=sigma)

# Plot the distribution of the pixel values of the filtered image
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.hist(filtered_image.flatten(), bins=50, color='blue', alpha=0.7)
plt.title('Distribution of Pixel Values (Filtered Image)')
plt.xlabel('Pixel Value')
plt.ylabel('Frequency')

# Display the FFT of the noisy image next to the filtered image
plt.subplot(1, 2, 2)
fft_noisy_image_shifted = fft.fftshift(fft.fft2(cat_grayscale_equalized_noise))
plt.imshow(np.log(np.abs(fft_noisy_image_shifted)), cmap='gray')
plt.title('FFT of Noisy Image')

plt.tight_layout()
plt.show()

Examine the plots and answer the following questions:

  1. When comparing the distribution of pixel values of the filtered image to the noisy image what change do you see? Given the properties of the Gaussian filter, how can you explain this change?
  2. Noting the differences in the FFTs of the filtered and noisy images, what has the filter done to the frequency components of the image? Given the properties of the Gaussian filter, how can you explain this change?
    End of exercise.

Answers:

  1. The Gaussian filter, applied with sigma = 3, significantly reduces the spread of pixel values and suppresses high-frequency noise in the image, resulting in a more centralized distribution of pixel values compared to the noisy image. The filter leads to a concentration of pixel values around the mean intensity.
  2. The filter has suppressed high-frequency components in the image's FFT while preserving low-frequency information. The filter's Gaussian kernel averages neighboring pixel values, blurring sharp transitions and suppressing high-frequency noise.

Gaussian Filtering with Gaussian Noise¶

We will now add Gaussian noise to the cat image. Execute the code in the cell below and examine the results.

In [42]:
sigma = 0.1
cat_grayscale_equalized_gaussian = cat_grayscale_equalized + np.random.normal(loc=0.25, scale=sigma, size=cat_grayscale_equalized.shape)

plot_grayscale(cat_grayscale_equalized_gaussian)
plot_gray_scale_distribution(cat_grayscale_equalized_gaussian)  
_=plt.imshow(image_power(cat_grayscale_equalized_gaussian), norm=LogNorm(vmin=5))

The Gaussian noise is evident in the image. Notice how the noise has both shifted and broadened the distribution of the pixel values, and how it produces a broad, flat spectrum.

For this example, we will ignore the fact that some pixel values now fall outside the usual range.
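The changes to the distribution follow directly from how independent noise combines with an image: the means add, and the variances add. The sketch below verifies this on a synthetic stand-in for the equalized image (the uniform test image and seed are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0.4, 0.6, size=(200, 200))   # stand-in for an equalized gray-scale image
noisy = image + rng.normal(loc=0.25, scale=0.1, size=image.shape)

# The mean shifts by the noise mean (0.25)
print(noisy.mean() - image.mean())

# The variances add, so the distribution broadens
print(np.sqrt(image.var() + 0.1**2), noisy.std())  # nearly equal

# Some values now spill outside the usual [0, 1] range
print(noisy.max() > 1.0)  # → True
```

This is why the noisy histogram is both displaced and wider than the original, and why out-of-range pixel values appear.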

Exercise 2-7: You will now apply Gaussian filters to the gray-scale cat image with Gaussian noise. The filters will have $\sigma$ values of $[1.0, 2.0, 3.0, 4.0]$. Make sure the titles of the images indicate the bandwidth value, $\sigma$. In the second column of each row display the 2-dimensional FFT of the filtered image.

In [58]:
fig, ax = plt.subplots(4, 2, figsize=(15, 20))
ax = ax.flatten()

sigmas = [1.0, 2.0, 3.0, 4.0]

for i, sigma in enumerate(sigmas):
    # Apply a Gaussian filter to the gray-scale cat image with Gaussian noise
    filtered_image = skfilters.gaussian(cat_grayscale_equalized_gaussian, sigma=sigma)
    ax[i*2].imshow(filtered_image, cmap='gray')
    ax[i*2].set_title(f'Filtered Image ($\sigma$={sigma})')

    # Compute the 2-d FFT of the filtered image
    fft_image_shifted = fft.fftshift(fft.fft2(filtered_image))

    # Plot the 2-d FFT on a log scale
    ax[i*2 + 1].imshow(np.log(np.abs(fft_image_shifted)), cmap='gray')
    ax[i*2 + 1].set_title('2D FFT of Filtered Image')

plt.tight_layout()
plt.show()
  1. How does the blurring of the image and the bandwidth shown by the FFT change with the bandwidth of the Gaussian filter?
  2. Compare the FFTs of the filtered image to the original equalized image. How has the bandwidth of the image features changed?
    End of exercise.

Answer:

  1. As $\sigma$ increases, the bandwidth of the Gaussian filter decreases: the image becomes more blurred, fine details are smoothed out, and high-frequency components are suppressed, which appears as a narrower bandwidth in the image's FFT. The FFTs show energy concentrating at low frequencies as high-frequency information is removed.
  2. The FFTs of the filtered images exhibit a reduction in high-frequency components compared to the original equalized cat image. This reduction indicates a narrowing of the bandwidth of image features.

Median Filtering¶

You have now explored the basic properties of Gaussian filters. We will now turn our attention to the median filter, which has noticeably different properties. The median filter is simple in principle: it replaces the pixel value at the center of the filter operator with the median value over the filter area.

As with the Gaussian filter, the span of the median filter must be selected. This span is almost always an odd number, so that there is a central pixel. The greater the span, the lower the bandwidth of the filter. Lower bandwidth reduces noise, but increases the blurring effect.
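The principle, and the key difference from a linear (mean or Gaussian) filter, can be seen on a small 1-d example. This sketch (numpy only, not part of the exercise) shows that a median filter removes an isolated outlier while keeping a step edge sharp, whereas a mean filter of the same span smears both:

```python
import numpy as np

# A signal with an outlier at index 2 and a 0 -> 1 step edge at index 5
signal = np.array([0., 0., 9., 0., 0., 1., 1., 1.])
span = 3

# Median filter: replace each point with the median of its span-wide window
medians = np.array([np.median(signal[i:i + span])
                    for i in range(len(signal) - span + 1)])
print(medians)  # → [0. 0. 0. 0. 1. 1.]  outlier removed, step preserved

# A mean filter of the same span smears both the outlier and the edge
means = np.convolve(signal, np.ones(span) / span, mode='valid')
print(means)
```

The median output removes the spike entirely and reproduces the step exactly, while the mean output spreads the spike over three samples and softens the edge. This edge-preserving behavior is why median filters are popular for impulsive ("salt and pepper") noise.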

Exercise 2-8: You will now explore how changing the median filter span affects the image. To see this effect, start with the gray-scale cat image with added Gaussian noise and then plot the median filtered image in 4 rows of a plot array using values of span $= [3, 5, 7, 9]$. Make sure the titles of the images indicate the span value. In the second column of each row display the 2-dimensional FFT of the filtered image.

In [59]:
from scipy.signal import medfilt2d

fig, ax = plt.subplots(4, 2, figsize=(15, 20))
ax = ax.flatten()

spans = [3, 5, 7, 9]

for i, span in enumerate(spans):
    # Apply a median filter to the gray-scale cat image with Gaussian noise
    median_filtered_image = medfilt2d(cat_grayscale_equalized_gaussian, kernel_size=span)
    ax[i*2].imshow(median_filtered_image, cmap='gray')
    ax[i*2].set_title(f'Median Filtered Image (Span={span})')

    # Compute the 2-d FFT of the median filtered image
    fft_image_shifted = fft.fftshift(fft.fft2(median_filtered_image))

    # Plot the 2-d FFT on a log scale
    ax[i*2 + 1].imshow(np.log(np.abs(fft_image_shifted)), cmap='gray')
    ax[i*2 + 1].set_title('2D FFT of Filtered Image')

plt.tight_layout()
plt.show()
  1. How does the blurring of the image and the bandwidth shown by the FFT change with the span of the median operator?
  2. Compare the FFTs of the filtered image to the original equalized image. How has the bandwidth of the image features changed?
    End of exercise.

Answer:

  1. As the span of the median operator increases, the image becomes more blurred, since the median is taken over a wider neighborhood. However, edges remain noticeably sharper than they would with a Gaussian filter of comparable bandwidth.
  2. The FFTs of the filtered images show a decrease in bandwidth compared to the original equalized image, indicating suppression of high-frequency noise and fine detail.

Convolution with Separable Kernels¶

We now turn our attention to efficient computation of 2-dimensional convolutions. If we directly apply a $k \times k$ convolutional operator, computing each output point requires $k^2$ multiply-add operations. Is there a way we can make this operation more efficient? Yes! Use a separable kernel.

Many commonly used linear convolutional operators can be separated into two 1-dimensional kernels, one horizontal and one vertical. From the principle of linear superposition, convolving the image sequentially with the two 1-dimensional kernels is identical to the original 2-dimensional convolution. Using a separable kernel requires only $2k$ multiply-add operations per point, a significant improvement over the 2-dimensional operator.

Consider the example of a $5 \times 5$ truncated Gaussian kernel:

$$ 2d\ kernel = \frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} $$

The Gaussian kernel is the only separable kernel which is also radially symmetric. This kernel can be separated into two 1-dimensional kernels, one horizontal and one vertical, each proportional to $[1, 4, 6, 4, 1]$; the outer product of this vector with itself reproduces the $5 \times 5$ kernel above. As a result, the equivalent of the computationally complex 2-d filter can be applied by two passes (vertical and horizontal) of the computationally efficient 1-d filters.
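The equivalence of the separable and 2-d forms can be checked numerically. The sketch below (using `scipy.signal.convolve2d` on an assumed random test image, not part of the exercise) confirms that two 1-d passes reproduce the direct $5 \times 5$ convolution at a fraction of the cost:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(1)
image = rng.random((32, 32))

k1d = np.array([1., 4., 6., 4., 1.]) / 16.0  # 1-d binomial (Gaussian) kernel
k2d = np.outer(k1d, k1d)                     # equivalent 5 x 5 kernel

# Direct 2-d convolution: ~k^2 = 25 multiplies per output point
direct = convolve2d(image, k2d, mode='valid')

# Separable version: a horizontal pass then a vertical pass, ~2k = 10 multiplies
rows = np.apply_along_axis(np.convolve, 1, image, k1d, mode='valid')
both = np.apply_along_axis(np.convolve, 0, rows, k1d, mode='valid')

print(np.allclose(direct, both))  # → True
```

The two results agree to floating-point precision, as linear superposition predicts.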

Exercise 2-9: You will now apply the separable 1-d kernels to the noisy gray-scale cat image by the following steps:

  1. First, you will verify that the separable kernels are indeed a good approximation of the 2-d kernel. Compute the outer product of the 1-d kernel with itself. Display this result, along with the normalization constant, which you can compute as kernel_norm = np.sum(d2_kernel).
  2. In the next code cell create a function that applies the 1-d kernel to the rows of the image. Your function should have two arguments, the image and the 1-d kernel, and will return the result. Keep in mind that the number of columns of the result will be $k$ points less. Use the 1-d convolution function you created for Exercise 2-2. Remember to divide the final result by the normalization constant.
  3. Apply your function to the gray-scale cat image with added Gaussian noise. Then apply your function with the same kernel to the transpose of the image.
  4. Display the filtered image and its discrete Fourier transform side by side as you did for Exercise 2-5.
In [60]:
# Define the 1-d kernel from the separable form of the 5 x 5 Gaussian kernel
kernel = np.array([1, 4, 6, 4, 1])

# Compute the outer product of the 1-d kernel with itself
d2_kernel = np.outer(kernel, kernel)
kernel_norm = np.sum(d2_kernel)

print("Outer product of the 1-D kernel with itself:")
print(d2_kernel)
print("Normalization constant:", kernel_norm)
Outer product of the 1-D kernel with itself:
[[ 1  4  6  4  1]
 [ 4 16 24 16  4]
 [ 6 24 36 24  6]
 [ 4 16 24 16  4]
 [ 1  4  6  4  1]]
Normalization constant: 256
In [61]:
def apply_kernel(image, kernel):
    '''Convolve each row of the image with the 1-d kernel.'''
    result = np.zeros_like(image, dtype=float)
    norm = np.sum(kernel)   # one 1-d pass; the two passes together divide by kernel_norm
    for i in range(image.shape[0]):
        result[i, :] = np.convolve(image[i, :], kernel, mode='same') / norm
    return result

# Apply the function to the rows of the noisy gray-scale cat image, then to the
# transpose of the row-filtered result so the columns are filtered as well
filtered_rows = apply_kernel(cat_grayscale_equalized_gaussian, kernel)
filtered_image = apply_kernel(filtered_rows.T, kernel).T

# Compute the FFT of the fully filtered image
fft_filtered = fft.fftshift(fft.fft2(filtered_image))

fig, ax = plt.subplots(1, 2, figsize=(15, 7))
ax[0].imshow(filtered_image, cmap='gray')
ax[0].set_title('Separably Filtered Image')
ax[1].imshow(np.log(np.abs(fft_filtered)), cmap='gray')
ax[1].set_title('FFT of Filtered Image')
plt.show()
  1. Does the outer product of the 1-d kernel with itself represent a good approximation of the truncated 2-d kernel?
  2. Compare the filtered image and Fourier transform of the filtered image to those of the unfiltered noisy image. What evidence can you see that the filter reduced the noise at the expense of blurring the image?
    End of exercise.

Answers:

  1. Yes. The outer product of the 1-d kernel $[1, 4, 6, 4, 1]$ with itself reproduces the truncated $5 \times 5$ Gaussian kernel exactly, up to the normalization constant of 256. The separable form therefore implements the same filter at a fraction of the computational cost.
  2. Compared to the unfiltered noisy image, the filtered image is smoother and its Fourier transform shows reduced energy at high frequencies. The noise has been suppressed, but fine detail is blurred along with it.

Image Pyramids¶

We have explored the effect of changing filter bandwidth on images. This approach is effective if an image has a limited range of scales of interest. However, in many cases, images contain features of interest at many scales. To address these situations we need a general method for decomposing images into multiple scales. The **image pyramid** is just such a technique.

An image pyramid uses Gaussian filters to down-sample the image by octaves, or powers of 2: $[1/2, 1/4, 1/8, \ldots]$. This process creates a hierarchy of scales from an original image, the image pyramid. The pyramid data structure is therefore a convenient way to represent multiple scales of an image.

The pyramid is created by recursively Gaussian filtering the image and then sub-sampling by a factor of 2. In summary, the recursion starts with the original image and then uses the most recently down-sampled image for the next filter-and-down-sample step. The down-sampling terminates either at a predetermined image size or when the resampled image is $1 \times 1$.
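One level of this recursion can be sketched by hand: Gaussian smooth, then keep every second pixel in each dimension. The sketch below (using `skimage.filters.gaussian` with an assumed $\sigma$ and a random test image) builds a full pyramid from a $256 \times 256$ image down to $1 \times 1$:

```python
import numpy as np
from skimage.filters import gaussian

def pyramid_level(image, sigma=1.0):
    """One recursion step: Gaussian smooth, then sub-sample by a factor of 2."""
    smoothed = gaussian(image, sigma=sigma)
    return smoothed[::2, ::2]

level0 = np.random.default_rng(3).random((256, 256))
levels = [level0]
while levels[-1].shape[0] > 1:          # terminate at a 1 x 1 image
    levels.append(pyramid_level(levels[-1]))

# The shapes halve at each level, ending at (1, 1)
print([lvl.shape for lvl in levels])
```

Starting from $256 \times 256$, eight halvings reach $1 \times 1$, giving nine levels in all. Library implementations such as `skimage.transform.pyramid_gaussian` (used below) follow the same recursion with carefully chosen smoothing parameters.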

The code in the cell below creates an image pyramid for the cat image. First, the cat image is down-sampled to an even binary size. Then the image pyramid is constructed. Execute this code.

Note: It is most convenient to perform down-sampling by octaves on an image whose dimensions are powers of 2, so that each level halves cleanly. The image is therefore first resized to $256 \times 256$ before skimage.transform.pyramid_gaussian is applied.

In [50]:
print(cat_image.shape)
cat_image_resize = transform.resize(cat_image, (256,256))
rows, cols, dim = cat_image_resize.shape
cat_pyramid = tuple(pyramid_gaussian(cat_image_resize, downscale=2, channel_axis=2))
(300, 451, 3)

What does this image pyramid look like? To find out, execute the code in the cell below to print the dimensions of the down-sampled images and display them.

In [51]:
composite_image = np.zeros((rows, cols + cols // 2, 3), dtype=np.double)

composite_image[:rows, :cols, :] = cat_pyramid[0]

i_row = 0
for p in cat_pyramid[1:]:
    n_rows, n_cols = p.shape[:2]
    print('Shape of the image in pyramid = ', str(p.shape[:2]))
    composite_image[i_row:i_row + n_rows, cols:cols + n_cols] = p
    i_row += n_rows

fig, ax = plt.subplots(figsize=(18, 12))
ax.imshow(composite_image)
plt.show()
Shape of the image in pyramid =  (128, 128)
Shape of the image in pyramid =  (64, 64)
Shape of the image in pyramid =  (32, 32)
Shape of the image in pyramid =  (16, 16)
Shape of the image in pyramid =  (8, 8)
Shape of the image in pyramid =  (4, 4)
Shape of the image in pyramid =  (2, 2)
Shape of the image in pyramid =  (1, 1)

Exercise 2-10: Notice the dimensions of the down-sampled images along the pyramid. How does the resolution change along the members of the image pyramid relate to the scale of the visible image features?

Answer: As we move up the image pyramid, the resolution halves at each level and fine details are lost. Each successive level therefore represents larger-scale features, while the base of the pyramid captures the finest details. A feature of a given size is matched to the pyramid level whose resolution is comparable to its scale.

Copyright 2022, 2023, 2024, Stephen F Elston. All rights reserved.¶